26 research outputs found

    Reduced bootstrap for the median

    In this paper we study a modified bootstrap that considers only those bootstrap samples satisfying k1 ≤ νn ≤ k2, for some 1 ≤ k1 ≤ k2 ≤ n, where νn is the number of distinct original observations in the bootstrap sample. We call it the reduced bootstrap, since it uses only a portion of the set of all possible bootstrap samples. We show that, under some conditions on k1 and k2, the reduced bootstrap consistently estimates the distribution and the variance of the sample median. Unlike the ordinary bootstrap, the reduced bootstrap variance estimator does not require conditions on the population generating the data to be a consistent estimator, but it does rely on an adequate choice of k1 and k2. Since several choices of k1 and k2 yield consistent estimators, we compare the finite sample performance of the corresponding estimators through a simulation study. The simulation study also considers consistent variance estimators proposed by other authors. Ministerio de Educación y Cienci
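
    The restriction k1 ≤ νn ≤ k2 described in this abstract can be illustrated with a minimal Python sketch. It assumes the admissible samples are obtained by rejection, that is, by discarding ordinary bootstrap samples whose distinct count falls outside [k1, k2], and it takes the variance of the bootstrap medians as the variance estimate. The function names, the default of 2000 replicates and the values of k1 and k2 in the demo are illustrative assumptions, not the authors' procedure or recommended settings.

```python
import random
import statistics


def draw_reduced_sample(data, k1, k2, rng, max_tries=100_000):
    """Draw one bootstrap sample whose number of distinct original
    observations (nu_n) satisfies k1 <= nu_n <= k2, by rejection."""
    n = len(data)
    for _ in range(max_tries):
        idx = [rng.randrange(n) for _ in range(n)]  # ordinary bootstrap draw
        nu = len(set(idx))                          # distinct original observations
        if k1 <= nu <= k2:
            return [data[i] for i in idx]
    raise RuntimeError("no admissible sample found; widen [k1, k2]")


def reduced_bootstrap_median_variance(data, k1, k2, n_boot=2000, seed=0):
    """Variance of the sample median over admissible bootstrap samples
    (hypothetical helper; a sketch, not the paper's estimator as published)."""
    n = len(data)
    if not 1 <= k1 <= k2 <= n:
        raise ValueError("need 1 <= k1 <= k2 <= n")
    rng = random.Random(seed)
    medians = [
        statistics.median(draw_reduced_sample(data, k1, k2, rng))
        for _ in range(n_boot)
    ]
    return statistics.pvariance(medians)


if __name__ == "__main__":
    demo_rng = random.Random(1)
    sample = [demo_rng.gauss(0.0, 1.0) for _ in range(25)]
    # k1 and k2 are chosen here only for illustration.
    print(reduced_bootstrap_median_variance(sample, k1=13, k2=19))
```

    Rejection sampling keeps the sketch short, but it becomes slow when [k1, k2] sits far from the typical value of νn, because most ordinary bootstrap samples are then discarded.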

    Time series clustering for estimating particulate matter contributions and its use in quantifying impacts from deserts

    Source apportionment studies use prior exploratory methods that are not purpose-oriented, and receptor modelling is based on chemical speciation, requiring costly, time-consuming analyses. Hidden Markov Models (HMMs) are proposed as a routine, exploratory tool to estimate PM10 source contributions. These models were used on annual time series (TS) data from 33 background sites in Spain and Portugal. HMMs enable the creation of groups of PM10 TS observations with similar concentration values, defining the pollutant's regimes of concentration. The results include estimations of source contributions from these regimes, the probability of change among them and their contribution to annual average PM10 concentrations. The annual average Saharan PM10 contribution in the Canary Islands was estimated and compared to other studies. A new procedure for quantifying the wind-blown desert contributions to daily average PM10 concentrations from monitoring sites is proposed. This new procedure seems to correct the net load estimation from deserts achieved with the most frequently used method.

    A Monte Carlo comparison of three consistent bootstrap procedures

    Since bootstrap samples are simple random samples with replacement from the original sample, the information content of some bootstrap samples can be very low. To avoid this, some authors have proposed several variants of the classical bootstrap. In this paper we consider two of them: the sequential or Poisson bootstrap and the reduced bootstrap. Both of them, like the ordinary bootstrap, can yield second-order accurate distribution estimators; that is, the three bootstrap procedures are asymptotically equivalent. The question that naturally arises is which of them should be used in practice, in other words, which of them should be used for finite sample sizes. To try to answer this question, we have carried out a simulation study. Although no method was found to exhibit the best performance in all the situations considered, some recommendations are given. Ministerio de Educación y Cienci
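
    The remark that some bootstrap samples carry little information can be made concrete by counting the distinct original observations in ordinary bootstrap samples. The Python sketch below compares a simulation with the standard expectation n(1 − (1 − 1/n)^n). The function names and the choice n = 20 are illustrative assumptions; the sketch does not implement the sequential or the reduced bootstrap.

```python
import random


def expected_distinct(n):
    """Expected number of distinct observations in an ordinary bootstrap
    sample of size n drawn from n points: n * (1 - (1 - 1/n)**n)."""
    return n * (1 - (1 - 1 / n) ** n)


def simulate_distinct_counts(n, n_boot=5000, seed=0):
    """Number of distinct indices in each of n_boot ordinary bootstrap samples."""
    rng = random.Random(seed)
    return [len({rng.randrange(n) for _ in range(n)}) for _ in range(n_boot)]


if __name__ == "__main__":
    n = 20
    counts = simulate_distinct_counts(n)
    print("simulated mean:", sum(counts) / len(counts))
    print("theoretical mean:", expected_distinct(n))
    print("smallest and largest count:", min(counts), max(counts))
```

    The spread between the smallest and largest count shows how much the number of distinct observations varies from one ordinary bootstrap sample to another, which is the variation the variants discussed in this abstract try to control.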

    Caso prático: A análise dos problemas financeiros da criação de microempresas com a ajuda de máquinas de vetores de suporte

    Despite the leading role that micro-entrepreneurship plays in economic development, and the high failure rate of microenterprise start-ups in their early years, very few studies have designed financial distress models to detect the financial problems of micro-entrepreneurs. Moreover, due to a lack of research, nothing is known about whether non-financial information and non-parametric statistical techniques improve the predictive capacity of these models. Therefore, this paper provides an innovative financial distress model specifically designed for microenterprise start-ups via support vector machines (SVMs) that employs financial, non-financial, and macroeconomic variables. Based on a sample of almost 5,500 micro-entrepreneurs from a Peruvian Microfinance Institution (MFI), our findings show that the introduction of non-financial information related to the zone in which the entrepreneurs live and situate their business, the duration of the MFI-entrepreneur relationship, the number of loans granted by the MFI in the last year, the loan destination, and the opinion of experts on the probability that microenterprise start-ups may experience financial problems significantly increases the accuracy of our financial distress model. Furthermore, the results reveal that the models that use SVMs outperform those that employ traditional logistic regression (LR) analysis.

    Modelling background air pollution exposure in urban environments: Implications for epidemiological research

    Background pollution represents the lowest levels of ambient air pollution to which the population is chronically exposed, but few studies have focused on thoroughly characterizing this regime. This study uses statistical clustering techniques as a modelling approach to characterize this pollution regime while deriving reliable information to be used as estimates of exposure in epidemiological studies. The background levels of four key pollutants in five urban areas of Andalusia (Spain) were characterized over an 11-year period (2005–2015) using four widely known clustering methods. For each pollutant data set, the first (lowest) cluster, representative of the background regime, was studied using finite mixture models, agglomerative hierarchical clustering, hidden Markov models (HMMs) and k-means. The HMM clustering method outperforms the other techniques used, providing important estimates of exposure related to background pollution, such as its mean, acuteness and time incidence values in ambient air, for all the air pollutants and sites studied.
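
    The idea of treating the first (lowest) cluster as the background regime can be sketched with the simplest of the four methods named in this abstract, k-means, applied to a one-dimensional concentration series. The Python sketch below is a toy illustration on made-up numbers, not the study's pipeline. The function names, the quantile-based initialisation, and the reading of "time incidence" as the share of observations in the lowest cluster are all assumptions.

```python
def assign(values, centers):
    """Group values by their nearest center."""
    clusters = [[] for _ in centers]
    for x in values:
        j = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
        clusters[j].append(x)
    return clusters


def kmeans_1d(values, k, n_iter=100):
    """Plain Lloyd iterations in one dimension, started at evenly spaced quantiles."""
    xs = sorted(values)
    centers = [xs[int((i + 0.5) * len(xs) / k)] for i in range(k)]
    for _ in range(n_iter):
        clusters = assign(xs, centers)
        # An empty cluster keeps its previous center.
        updated = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(xs, centers)


def background_regime(values, k):
    """Mean of the lowest cluster and the share of observations it contains."""
    centers, clusters = kmeans_1d(values, k)
    lowest = min(range(k), key=lambda c: centers[c])
    return centers[lowest], len(clusters[lowest]) / len(values)


if __name__ == "__main__":
    # Made-up daily concentrations, for illustration only.
    series = [0.9, 1.0, 1.1, 1.2, 4.9, 5.0, 5.2, 9.8, 10.0, 10.5]
    mean, share = background_regime(series, k=3)
    print("background mean:", mean, "share of days:", share)
```

    Unlike a hidden Markov model, k-means ignores the time ordering of the observations, so this sketch cannot estimate transition probabilities between regimes.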

    Review and Comparison of Intelligent Optimization Modelling Techniques for Energy Forecasting and Condition-Based Maintenance in PV Plants

    Within the field of soft computing, intelligent optimization modelling techniques include various major techniques in artificial intelligence. These techniques aim to generate new business knowledge by transforming sets of "raw data" into business value. One of the principal applications of these techniques is the design of predictive analytics for improving advanced CBM (condition-based maintenance) strategies and energy production forecasting. These advanced techniques can be used to transform control system data, operational data and maintenance event data into failure diagnostic and prognostic knowledge and, ultimately, to derive expected energy generation. One of the systems where these techniques can be applied with massive potential impact is the legacy monitoring systems existing in solar PV energy generation plants. These systems produce a great amount of data over time, while at the same time they demand considerable effort to increase their performance through more accurate predictive analytics that reduce production losses with a direct impact on ROI. How to choose the most suitable techniques to apply is one of the problems to address. This paper presents a review and a comparative analysis of six intelligent optimization modelling techniques, which have been applied to a PV plant case study, using the energy production forecast as the decision variable. The proposed methodology not only aims to elicit the most accurate solution but also validates the results by comparing the different outputs of the different techniques.

    e-Encuestas Probabilísticas II. Los Métodos de Muestreo Probabilístico

    This work mainly addresses surveys that are conducted over the Internet. Specifically, it focuses on formulating and developing probabilistic sampling designs that allow surveys to be carried out on the World Wide Web with the rigour needed to infer the results obtained to the population under study, with a given reliability.

    A new approach to influence analysis in linear models

    We propose a new approach to the study of influence in the General Linear Model based on conditional bias. This approach enables us to apply such an analysis to all particular cases of this model. The theoretical foundation on which this approach is based does not presuppose a particular hypothesis on the distribution of the variables. Applying the results obtained to the Multiple Linear Regression Model yields measures of influence already proposed by other authors. Finally, we apply the results to the analysis of covariance.

    Modeling the Financial Distress of Microenterprise Start-Ups Using Support Vector Machines: A Case Study

    Despite the leading role that micro-entrepreneurship plays in economic development, and the high failure rate of microenterprise start-ups in their early years, very few studies have designed financial distress models to detect the financial problems of micro-entrepreneurs. Moreover, due to a lack of research, nothing is known about whether non-financial information and non-parametric statistical techniques improve the predictive capacity of these models. Therefore, this paper provides an innovative financial distress model specifically designed for microenterprise start-ups via support vector machines (SVMs) that employs financial, non-financial, and macroeconomic variables. Based on a sample of almost 5,500 micro-entrepreneurs from a Peruvian Microfinance Institution (MFI), our findings show that the introduction of non-financial information related to the zone in which the entrepreneurs live and situate their business, the duration of the MFI-entrepreneur relationship, the number of loans granted by the MFI in the last year, the loan destination, and the opinion of experts on the probability that microenterprise start-ups may experience financial problems significantly increases the accuracy of our financial distress model. Furthermore, the results reveal that the models that use SVMs outperform those that employ traditional logistic regression (LR) analysis.